A Scalable Distributed Search Engine for Fresh Information Retrieval
Authors
Abstract
We have developed a distributed search engine, the Cooperative Search Engine (CSE), to retrieve fresh information. In CSE, a local search engine on each web server indexes that server's local pages, and a meta search server integrates these local search engines into a global search engine. This design incurs communication delay at retrieval time, so we have developed several speedup techniques to achieve real-time retrieval. In addition, the meta search server is a single point of failure in CSE, so we introduce redundancy of the meta search server to increase the availability of CSE. In this paper, we describe the scalability and reliability of CSE and their evaluation.
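The architecture in the abstract can be illustrated with a minimal sketch. It is not the CSE implementation: the class names `LocalSearchEngine` and `MetaSearchServer`, the `search_with_failover` helper, the `alive` flag, and the example URLs are all assumptions made for illustration. The sketch shows each site indexing only its own pages, a meta server fanning a query out to the local engines in parallel and merging the hits, and a client trying redundant meta servers in order so that one failed replica does not make the service unavailable.

```python
from concurrent.futures import ThreadPoolExecutor


class LocalSearchEngine:
    """Hypothetical local engine: indexes only the pages of its own web server."""

    def __init__(self, site, pages):
        self.site = site
        self.index = {}  # term -> set of URLs on this site
        for url, text in pages.items():
            for term in text.lower().split():
                self.index.setdefault(term, set()).add(url)

    def search(self, term):
        return sorted(self.index.get(term.lower(), set()))


class MetaSearchServer:
    """Hypothetical meta server: forwards a query to every local engine and merges hits."""

    def __init__(self, engines, alive=True):
        self.engines = engines
        self.alive = alive  # assumption: simulates a crashed replica when False

    def search(self, term):
        if not self.alive:
            raise ConnectionError("meta server unavailable")
        # Query the local engines in parallel to hide per-site communication delay.
        with ThreadPoolExecutor(max_workers=max(1, len(self.engines))) as pool:
            per_site = list(pool.map(lambda e: e.search(term), self.engines))
        return sorted(url for hits in per_site for url in hits)


def search_with_failover(meta_servers, term):
    """Try each redundant meta server in order; use the first one that answers."""
    for server in meta_servers:
        try:
            return server.search(term)
        except ConnectionError:
            continue
    raise ConnectionError("no meta search server reachable")


engines = [
    LocalSearchEngine("a.example", {"http://a.example/1": "fresh news today"}),
    LocalSearchEngine("b.example", {"http://b.example/1": "fresh bread recipe"}),
]
# The first replica is down; the second one serves the query.
replicas = [MetaSearchServer(engines, alive=False), MetaSearchServer(engines)]
results = search_with_failover(replicas, "fresh")
print(results)  # ['http://a.example/1', 'http://b.example/1']
```

Because every page is indexed where it is served, a change to a page is searchable as soon as its local index is updated, with no crawl needed. The cost is the per-query fan-out, which is why the sketch issues the local searches in parallel.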
Similar resources
Temporal ranking for fresh information retrieval
In business, the retrieval of up-to-date, or fresh, information is very important. It is difficult for conventional search engines based on a centralized architecture to retrieve fresh information, because they take a long time to collect documents via Web robots. In contrast to a centralized architecture, a search engine based on a distributed architecture does not need to collect documents, b...
Review of ranked-based and unranked-based metrics for determining the effectiveness of search engines
Purpose: Traditionally, there have been many metrics for evaluating search engines; nevertheless, various researchers have proposed new metrics in recent years. Awareness of these new metrics is essential for conducting research on search engine evaluation. So, the purpose of this study was to provide an analysis of important and new metrics for evaluating search engines. Methodology: This is ...
Text Based Approaches for Content Based Image Retrieval in a P2P Network
The tremendous growth of digital multimedia content on the web requires scalable, efficient, and effective information retrieval mechanisms. Handling such large collections of data in a centralized way requires costly high-bandwidth connectivity and powerful servers. This establishes the need for distributed architectures, such as peer-to-peer systems, that allow sharing of data management and s...
Design and Implementation of Scalable, Fully Distributed Web Crawler for a Web Search Engine
The Web is a context in which traditional Information Retrieval methods are challenged. Given the volume of the Web and its speed of change, the coverage of modern web search engines is relatively small. Search engines attempt to crawl the web exhaustively with crawlers for new pages, and to keep track of changes made to pages visited earlier. The centralized design of crawlers introduces limita...
A Scalable Semantic Indexing Framework for Peer-to-Peer Information Retrieval
The exponential growth of data demands scalable and adaptable infrastructures for indexing and searching a huge number of data sources with high accuracy and efficiency. Existing centralized search engines are not scalable and suffer from single points of failure. The recent work on P2P index construction partitions the document vectors either randomly or statically, making it difficult to trade...